{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Async Ingestion Pipeline + Metadata Extraction\n",
    "\n",
    "Recently, LlamaIndex has introduced async metadata extraction. Let's compare metadata extraction speeds in an ingestion pipeline using a newer and older version of LlamaIndex.\n",
    "\n",
    "We will test a pipeline using the classic Paul Graham essay."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p 'data/paul_graham/'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\n",
    "    \"OPENAI_API_KEY\"\n",
    "] = \"sk-..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## New LlamaIndex Ingestion\n",
    "\n",
    "Using a version of LlamaIndex greater or equal to v0.9.7, we can take advantage of improved async metadata extraction within ingestion pipelines.\n",
    "\n",
    "**NOTE:** Restart your notebook after installing a new version!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install \"llama_index>=0.9.7\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**NOTE:** The `num_workers` kwarg controls how many requests can be outgoing at a given time using an async semaphore. Setting it higher may increase speeds, but can also lead to timeouts or rate limits, so set it wisely."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.embeddings import OpenAIEmbedding\n",
    "from llama_index.llms import OpenAI\n",
    "from llama_index.ingestion import IngestionPipeline\n",
    "from llama_index.extractors import TitleExtractor, SummaryExtractor\n",
    "from llama_index.text_splitter import SentenceSplitter\n",
    "from llama_index.schema import MetadataMode\n",
    "\n",
    "\n",
    "def build_pipeline():\n",
    "    llm = OpenAI(model=\"gpt-3.5-turbo-1106\", temperature=0.1)\n",
    "\n",
    "    transformations = [\n",
    "        SentenceSplitter(chunk_size=1024, chunk_overlap=20),\n",
    "        TitleExtractor(\n",
    "            llm=llm, metadata_mode=MetadataMode.EMBED, num_workers=8\n",
    "        ),\n",
    "        SummaryExtractor(\n",
    "            llm=llm, metadata_mode=MetadataMode.EMBED, num_workers=8\n",
    "        ),\n",
    "        OpenAIEmbedding(),\n",
    "    ]\n",
    "\n",
    "    return IngestionPipeline(transformations=transformations)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import SimpleDirectoryReader\n",
    "\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham\").load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 5/5 [00:01<00:00,  3.99it/s]\n",
      "100%|██████████| 18/18 [00:07<00:00,  2.36it/s]\n",
      "100%|██████████| 5/5 [00:01<00:00,  2.97it/s]\n",
      "100%|██████████| 18/18 [00:06<00:00,  2.63it/s]\n",
      "100%|██████████| 5/5 [00:01<00:00,  3.84it/s]\n",
      "100%|██████████| 18/18 [01:07<00:00,  3.75s/it]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Average time: 31.196589946746826\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "times = []\n",
    "for _ in range(3):\n",
    "    time.sleep(30)  # help prevent rate-limits/timeouts, keeps each run fair\n",
    "    pipline = build_pipeline()\n",
    "    start = time.time()\n",
    "    nodes = await pipline.arun(documents=documents)\n",
    "    end = time.time()\n",
    "    times.append(end - start)\n",
    "\n",
    "print(f\"Average time: {sum(times) / len(times)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The current `openai` python client package is a tad unstable -- sometimes async jobs will timeout, skewing the average. You can see the last progress bar took 1 minute instead of the previous 6 or 7 seconds, skewing the average."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Old LlamaIndex Ingestion\n",
    "\n",
    "Now, lets compare to an older version of LlamaIndex, which was using \"fake\" async for metadata extraction.\n",
    "\n",
    "**NOTE:** Restart your notebook after installing the new version!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Collecting llama_index<0.9.6\n",
      "  Obtaining dependency information for llama_index<0.9.6 from https://files.pythonhosted.org/packages/ac/3c/dee8ec4fecaaeabbd8a61ade9ddb6af09d05553c2a0acbebd1b559eaeb30/llama_index-0.9.5-py3-none-any.whl.metadata\n",
      "  Downloading llama_index-0.9.5-py3-none-any.whl.metadata (8.2 kB)\n",
      "Requirement already satisfied: SQLAlchemy[asyncio]>=1.4.49 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (2.0.23)\n",
      "Requirement already satisfied: aiohttp<4.0.0,>=3.8.6 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (3.8.6)\n",
      "Requirement already satisfied: aiostream<0.6.0,>=0.5.2 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (0.5.2)\n",
      "Requirement already satisfied: beautifulsoup4<5.0.0,>=4.12.2 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (4.12.2)\n",
      "Requirement already satisfied: dataclasses-json in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (0.5.14)\n",
      "Requirement already satisfied: deprecated>=1.2.9.3 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (1.2.14)\n",
      "Requirement already satisfied: fsspec>=2023.5.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (2023.10.0)\n",
      "Requirement already satisfied: httpx in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (0.24.1)\n",
      "Requirement already satisfied: nest-asyncio<2.0.0,>=1.5.8 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (1.5.8)\n",
      "Requirement already satisfied: nltk<4.0.0,>=3.8.1 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (3.8.1)\n",
      "Requirement already satisfied: numpy in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (1.24.4)\n",
      "Requirement already satisfied: openai>=1.1.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (1.3.5)\n",
      "Requirement already satisfied: pandas in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (2.0.3)\n",
      "Requirement already satisfied: tenacity<9.0.0,>=8.2.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (8.2.3)\n",
      "Requirement already satisfied: tiktoken>=0.3.3 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (0.5.1)\n",
      "Requirement already satisfied: typing-extensions>=4.5.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (4.8.0)\n",
      "Requirement already satisfied: typing-inspect>=0.8.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (0.8.0)\n",
      "Requirement already satisfied: urllib3<2 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from llama_index<0.9.6) (1.26.18)\n",
      "Requirement already satisfied: attrs>=17.3.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.6->llama_index<0.9.6) (23.1.0)\n",
      "Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.6->llama_index<0.9.6) (3.3.2)\n",
      "Requirement already satisfied: multidict<7.0,>=4.5 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.6->llama_index<0.9.6) (6.0.4)\n",
      "Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.6->llama_index<0.9.6) (4.0.3)\n",
      "Requirement already satisfied: yarl<2.0,>=1.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.6->llama_index<0.9.6) (1.9.2)\n",
      "Requirement already satisfied: frozenlist>=1.1.1 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.6->llama_index<0.9.6) (1.4.0)\n",
      "Requirement already satisfied: aiosignal>=1.1.2 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.6->llama_index<0.9.6) (1.3.1)\n",
      "Requirement already satisfied: soupsieve>1.2 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from beautifulsoup4<5.0.0,>=4.12.2->llama_index<0.9.6) (2.5)\n",
      "Requirement already satisfied: wrapt<2,>=1.10 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from deprecated>=1.2.9.3->llama_index<0.9.6) (1.15.0)\n",
      "Requirement already satisfied: click in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from nltk<4.0.0,>=3.8.1->llama_index<0.9.6) (8.1.7)\n",
      "Requirement already satisfied: joblib in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from nltk<4.0.0,>=3.8.1->llama_index<0.9.6) (1.3.2)\n",
      "Requirement already satisfied: regex>=2021.8.3 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from nltk<4.0.0,>=3.8.1->llama_index<0.9.6) (2023.10.3)\n",
      "Requirement already satisfied: tqdm in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from nltk<4.0.0,>=3.8.1->llama_index<0.9.6) (4.66.1)\n",
      "Requirement already satisfied: anyio<4,>=3.5.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from openai>=1.1.0->llama_index<0.9.6) (3.7.1)\n",
      "Requirement already satisfied: distro<2,>=1.7.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from openai>=1.1.0->llama_index<0.9.6) (1.8.0)\n",
      "Requirement already satisfied: pydantic<3,>=1.9.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from openai>=1.1.0->llama_index<0.9.6) (1.10.12)\n",
      "Requirement already satisfied: certifi in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from httpx->llama_index<0.9.6) (2023.7.22)\n",
      "Requirement already satisfied: httpcore<0.18.0,>=0.15.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from httpx->llama_index<0.9.6) (0.17.3)\n",
      "Requirement already satisfied: idna in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from httpx->llama_index<0.9.6) (3.4)\n",
      "Requirement already satisfied: sniffio in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from httpx->llama_index<0.9.6) (1.3.0)\n",
      "Requirement already satisfied: greenlet!=0.4.17 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from SQLAlchemy[asyncio]>=1.4.49->llama_index<0.9.6) (3.0.1)\n",
      "Requirement already satisfied: requests>=2.26.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from tiktoken>=0.3.3->llama_index<0.9.6) (2.31.0)\n",
      "Requirement already satisfied: mypy-extensions>=0.3.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from typing-inspect>=0.8.0->llama_index<0.9.6) (1.0.0)\n",
      "Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from dataclasses-json->llama_index<0.9.6) (3.20.1)\n",
      "Requirement already satisfied: python-dateutil>=2.8.2 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from pandas->llama_index<0.9.6) (2.8.2)\n",
      "Requirement already satisfied: pytz>=2020.1 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from pandas->llama_index<0.9.6) (2023.3.post1)\n",
      "Requirement already satisfied: tzdata>=2022.1 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from pandas->llama_index<0.9.6) (2023.3)\n",
      "Requirement already satisfied: h11<0.15,>=0.13 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from httpcore<0.18.0,>=0.15.0->httpx->llama_index<0.9.6) (0.14.0)\n",
      "Requirement already satisfied: packaging>=17.0 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json->llama_index<0.9.6) (23.2)\n",
      "Requirement already satisfied: six>=1.5 in /home/loganm/.cache/pypoetry/virtualenvs/llama-index-4a-wkI5X-py3.11/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas->llama_index<0.9.6) (1.16.0)\n",
      "Downloading llama_index-0.9.5-py3-none-any.whl (893 kB)\n",
      "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m893.9/893.9 kB\u001b[0m \u001b[31m6.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n",
      "\u001b[?25hInstalling collected packages: llama_index\n",
      "  Attempting uninstall: llama_index\n",
      "    Found existing installation: llama-index 0.9.8.post1\n",
      "    Uninstalling llama-index-0.9.8.post1:\n",
      "      Successfully uninstalled llama-index-0.9.8.post1\n",
      "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
      "trulens-eval 0.18.0 requires llama-index==0.8.69, but you have llama-index 0.9.5 which is incompatible.\n",
      "trulens-eval 0.18.0 requires typing-extensions==4.5.0, but you have typing-extensions 4.8.0 which is incompatible.\u001b[0m\u001b[31m\n",
      "\u001b[0mSuccessfully installed llama_index-0.9.5\n",
      "\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.3.1\u001b[0m\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "!pip install \"llama_index<0.9.6\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.embeddings import OpenAIEmbedding\n",
    "from llama_index.llms import OpenAI\n",
    "from llama_index.ingestion import IngestionPipeline\n",
    "from llama_index.extractors import TitleExtractor, SummaryExtractor\n",
    "from llama_index.text_splitter import SentenceSplitter\n",
    "from llama_index.schema import MetadataMode\n",
    "\n",
    "\n",
    "def build_pipeline():\n",
    "    llm = OpenAI(model=\"gpt-3.5-turbo-1106\", temperature=0.1)\n",
    "\n",
    "    transformations = [\n",
    "        SentenceSplitter(chunk_size=1024, chunk_overlap=20),\n",
    "        TitleExtractor(llm=llm, metadata_mode=MetadataMode.EMBED),\n",
    "        SummaryExtractor(llm=llm, metadata_mode=MetadataMode.EMBED),\n",
    "        OpenAIEmbedding(),\n",
    "    ]\n",
    "\n",
    "    return IngestionPipeline(transformations=transformations)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import SimpleDirectoryReader\n",
    "\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham\").load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "6c78c49429784a37b06e5dc3624953c6",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Extracting titles:   0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "91c075a5d32a44db8bf620642b1aa76e",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Extracting summaries:   0%|          | 0/18 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "f7370d84200d46dea99fc32ba3bdbab8",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Extracting titles:   0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a27e54696a9540a6abe9a14c057dd870",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Extracting summaries:   0%|          | 0/18 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "963362eaf9964339b29cb4c752ec3920",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Extracting titles:   0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "5ef20d347ba94f959f3a45c121aacf47",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Extracting summaries:   0%|          | 0/18 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Average time: 106.17690531412761\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "times = []\n",
    "for _ in range(3):\n",
    "    time.sleep(30)  # help prevent rate-limits/timeouts, keeps each run fair\n",
    "    pipline = build_pipeline()\n",
    "    start = time.time()\n",
    "    nodes = await pipline.arun(documents=documents)\n",
    "    end = time.time()\n",
    "    times.append(end - start)\n",
    "\n",
    "print(f\"Average time: {sum(times) / len(times)}\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama-index-4a-wkI5X-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
